tts voice
Super Kawaii Vocalics: Amplifying the "Cute" Factor in Computer Voice
Mandai, Yuto, Seaborn, Katie, Nakano, Tomoyasu, Sun, Xin, Wang, Yijia, Kato, Jun
"Kawaii" is the Japanese concept of cute, which carries sociocultural connotations related to social identities and emotional responses. Yet, virtually all work to date has focused on the visual side of kawaii, including in studies of computer agents and social robots. In pursuit of formalizing the new science of kawaii vocalics, we explored what elements of voice relate to kawaii and how they might be manipulated, manually and automatically. We conducted a four-phase study (grand N = 512) with two varieties of computer voices: text-to-speech (TTS) and game character voices. We found kawaii "sweet spots" through manipulation of fundamental and formant frequencies, but only for certain voices and to a certain extent. Findings also suggest a ceiling effect for the kawaii vocalics of certain voices. We offer empirical validation of the preliminary kawaii vocalics model and an elementary method for manipulating kawaii perceptions of computer voice.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > United Kingdom > England > Greater London > London (0.14)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.06)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.93)
- Media > Music (0.93)
- Health & Medicine (0.88)
- Leisure & Entertainment > Games > Computer Games (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)
- Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.34)
- Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.34)
- Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.34)
Evaluating and Personalizing User-Perceived Quality of Text-to-Speech Voices for Delivering Mindfulness Meditation with Different Physical Embodiments
Shi, Zhonghao, Chen, Han, Velentza, Anna-Maria, Liu, Siqi, Dennler, Nathaniel, O'Connell, Allison, Matarić, Maja
Mindfulness-based therapies have been shown to be effective in improving mental health, and technology-based methods have the potential to expand the accessibility of these therapies. To enable real-time personalized content generation for mindfulness practice in these methods, high-quality computer-synthesized text-to-speech (TTS) voices are needed to provide verbal guidance and respond to user performance and preferences. However, the user-perceived quality of state-of-the-art TTS voices has not yet been evaluated for administering mindfulness meditation, which requires emotional expressiveness. In addition, work has not yet been done to study the effect of physical embodiment and personalization on the user-perceived quality of TTS voices for mindfulness. To that end, we designed a two-phase human subject study. In Phase 1, an online Mechanical Turk between-subject study (N=471) evaluated 3 (feminine, masculine, child-like) state-of-the-art TTS voices with 2 (feminine, masculine) human therapists' voices in 3 different physical embodiment settings (no agent, conversational agent, socially assistive robot) with remote participants. Building on findings from Phase 1, in Phase 2, an in-person within-subject study (N=94), we used a novel framework we developed for personalizing TTS voices based on user preferences, and evaluated user-perceived quality compared to best-rated non-personalized voices from Phase 1. We found that the best-rated human voice was perceived better than all TTS voices; the emotional expressiveness and naturalness of TTS voices were poorly rated, while users were satisfied with the clarity of TTS voices. Surprisingly, by allowing users to fine-tune TTS voice features, the user-personalized TTS voices could perform almost as well as human voices, suggesting user personalization could be a simple and very effective tool to improve user-perceived quality of TTS voice.
- North America > United States > California > Los Angeles County > Los Angeles (0.29)
- Europe > Sweden > Stockholm > Stockholm (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Strength High (0.68)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.86)
The 1st NovelAI Stream & Q&A Summary
First off, thank you all for making the first NovelAI Twitch stream a blast! We've had a fantastic time answering questions, chatting, and seeing your excitement for the future of NovelAI! Now it's time to summarize everything we covered during the stream. Let's reintroduce the three hosts of the stream: Behind the goose images, we can find Kurumuz, NovelAI's project lead. TabloidA is the designer responsible for the beauty and accessibility of the UI. Finally, we have Aini, the community manager.
ReadSpeaker Presents Conversational Neural TTS and New Voice Personas - ReadSpeaker AI
ReadSpeaker continues to expand its broad portfolio of high-quality, lifelike Neural text-to-speech (TTS) voices. The voice Adam communicates with users in a way that feels like a friendly, real-life conversation, and is a good fit for speech-enabled conversational solutions. Our state-of-the-art Neural TTS technology, an AI-powered machine learning model, enables ReadSpeaker's TTS voices to learn natural intonation from real-life speech data and to adjust delivery and speaking style according to specific contexts. We are also excited to introduce two new neural text-to-speech voice personas for Turkish.
Enable read-aloud for your application with Azure neural TTS
Voice is becoming increasingly popular in providing useful and engaging experiences for customers and employees. The Text-to-Speech (TTS) capability of Speech on Azure Cognitive Services allows you to quickly create an intelligent read-aloud experience for your scenarios. In this blog, we'll walk through an exercise, which you can complete in under two hours, to get started using Azure neural TTS voices and enable your apps to read content aloud. We'll provide high-level guidance and sample code to get you started, and we encourage you to play around with the code and get creative with your solution! Read-aloud is a modern way to help people read and consume content such as emails and Word documents more easily.
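The blog's own sample code is not reproduced in this excerpt. As a rough illustration only, a read-aloud feature typically hands the TTS service an SSML document rather than raw text. The sketch below builds such a document with Python's standard library and does not call any Azure API. The function name `build_read_aloud_ssml`, the default voice name `en-US-JennyNeural`, and the optional `rate` parameter are assumptions made for the example, not details taken from the post.

```python
import xml.etree.ElementTree as ET

# W3C SSML namespace; SSML is the markup that neural TTS services commonly accept.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_read_aloud_ssml(text, voice="en-US-JennyNeural", lang="en-US", rate=None):
    """Wrap plain text in a minimal SSML document.

    `voice` is an assumed example voice name; substitute one that your
    TTS service actually offers. `rate` (e.g. "-10%") is optional and,
    when given, wraps the text in a <prosody> element.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    voice_el = ET.SubElement(speak, "voice", {"name": voice})

    # Put the text directly under <voice>, or under <prosody> if a rate is set.
    target = voice_el
    if rate is not None:
        target = ET.SubElement(voice_el, "prosody", {"rate": rate})
    target.text = text  # ElementTree escapes &, <, > on serialization

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_read_aloud_ssml("Your meeting starts at 3 PM.", rate="-10%"))
```

The resulting string could then be passed to whichever speech SDK or REST endpoint the application uses. Building the markup with an XML library instead of string concatenation means that content such as email text containing `&` or `<` is escaped correctly.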